17.1 DNA Sequencing
the unknown DNA. 9 One should also note inexpensive methods designed to detect
the presence of a mutation in a sequence; steady progress in automation is enabling
ever larger pieces of DNA to be tackled.
17.1.4 Expressed Sequence Tags
Expressed sequence tags (ESTs) are derived from the cDNA complementary to
mRNA. They consist of the sequence of typically 200–600 bases of a gene,
sufficient to uniquely identify the gene. The importance of ESTs is, however, tending
to diminish as sequencing methods become more powerful.
Expressed sequence tags are generated by isolating the mRNA from a particular
cell line or tissue and reverse-transcribing it into cDNA, which is then cloned into a
vector to make a “library”. 10 Some 400 bases from the ends of individual clones are
then sequenced.
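The steps just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: sequences are plain strings, reverse transcription is treated as error-free, and the function names (`mrna_to_cdna`, `end_reads`) are hypothetical, not part of any library.

```python
from typing import Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (written 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mrna_to_cdna(mrna: str) -> str:
    """Idealized reverse transcription: the DNA strand complementary to the mRNA."""
    return reverse_complement(mrna.replace("U", "T"))


def end_reads(clone: str, read_length: int = 400) -> Tuple[str, str]:
    """Return one read from each end of a cloned insert.

    The second read is taken from the opposite strand, so it is the start
    of the reverse complement of the clone.
    """
    five_prime = clone[:read_length]
    three_prime = reverse_complement(clone)[:read_length]
    return five_prime, three_prime
```

For example, `end_reads(clone)` with the default `read_length=400` yields a pair of tags of the size mentioned above; real reads would in addition contain sequencing errors.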
If they overlap, ESTs can be used to reconstruct the whole sequence as in shotgun
sequencing, but their primary use is to facilitate the rapid identification of DNA.
For various reasons, not least low-fidelity reverse transcription, the sequences are typically
considerably less reliable than those generated by conventional gene sequencing.
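How two overlapping ESTs might be joined can be illustrated with a minimal sketch. It assumes error-free reads on the same strand and exact suffix-prefix matching, which real ESTs rarely satisfy; the function names and the `min_overlap` parameter are hypothetical choices made for the illustration.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two reads through their overlap, or return None if they do not overlap."""
    k = overlap_length(left, right, min_overlap)
    if k == 0:
        return None
    return left + right[k:]
```

Here `merge("ATGGCCTTA", "CTTAGGA")` joins the two reads through the shared `CTTA`. A practical assembler would have to tolerate mismatches, given the limited reliability of EST sequences.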
17.1.5 Next Generation Sequencing
With next (or second) generation sequencing (NGS, also known as massively parallel
sequencing or deep sequencing), an entire human genome can be sequenced within
a few hours in the most favourable cases, compared with the ten years or so required
to produce the first finished human genome sequence using conventional Sanger
sequencing. 11 The principle of NGS is not, however, very different from that of the
Sanger method; essentially, it is a parallelization of the latter.
In NGS, the DNA is randomly fragmented, either enzymatically or by sonication.
Synthetic double-stranded oligonucleotides of known sequences are attached to the
fragments (adapter ligation) with the help of DNA ligase. The adapters enable the
fragments to become bound to a planar array of complementary counterparts. The
collection of fragments is known as a “library”.
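Library preparation as described above can be mimicked in silico. The sketch below is a toy model: fragment lengths are drawn from an exponential distribution purely for convenience, single strings stand in for double-stranded molecules, and the adapter sequences passed to `ligate_adapters` are arbitrary placeholders rather than those of any real sequencing platform.

```python
import random
from typing import List, Optional


def fragment(dna: str, mean_length: int = 200,
             rng: Optional[random.Random] = None) -> List[str]:
    """Cut `dna` at random positions into consecutive fragments.

    Fragment lengths are drawn from an exponential distribution with the
    given mean (an assumption of this sketch, not a model of sonication).
    """
    rng = rng or random.Random()
    fragments = []
    position = 0
    while position < len(dna):
        length = max(1, int(rng.expovariate(1.0 / mean_length)))
        fragments.append(dna[position:position + length])
        position += length
    return fragments


def ligate_adapters(fragments: List[str], adapter_5: str,
                    adapter_3: str) -> List[str]:
    """Attach a known adapter sequence to each end of every fragment."""
    return [adapter_5 + frag + adapter_3 for frag in fragments]
```

Because the adapter sequences are known, every member of the resulting library carries the same end sequences regardless of its unknown insert, which is what allows all fragments to bind to the same array of complementary oligonucleotides.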
The library must then be “amplified” using the PCR (Sect. 17.1.2); that is, many
copies of each fragment are made in order to ensure sufficiently strong signals
from the subsequent sequencing. Reaction conditions are chosen to favour
the formation of clusters of identical strands.
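A rough feeling for the number of copies produced can be obtained from the usual idealized model of the PCR, in which each cycle multiplies the number of strands by (1 + E), where E is the per-cycle efficiency (E = 1 corresponds to perfect doubling). The sketch below assumes a constant efficiency, which real reactions do not maintain in later cycles; the function name is hypothetical.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of copies after `cycles` PCR cycles.

    Uses the idealized model N = N0 * (1 + E)**n with constant efficiency E
    between 0 (no amplification) and 1 (perfect doubling each cycle).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect doubling, `pcr_copies(1, 30)` gives 2**30, i.e., roughly a thousand million copies from a single fragment.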
9 See França et al. (2002) for a review, and Braslavsky et al. (2003) for a single-molecule technique.
10 In this context, “library” is used merely to denote “collection”.
11 The Human Genome Project was completed in 2003; NGS was introduced in 2005.